Data Science

Welcome to Cornell College!

Tyler George

What is Data Science?

Definition:

IBM: “Data science combines math and statistics, specialized programming, advanced analytics, artificial intelligence (AI) and machine learning with specific subject matter expertise to uncover actionable insights hidden in an organization’s data. These insights can be used to guide decision making and strategic planning.”

What does this mean?

Meet data science

  • Data science is an exciting discipline that allows you to turn raw data into understanding, insight, and knowledge.

Today's Software

R

An R shell

RStudio

An RStudio window

Data science life cycle

Import

Data science life cycle, with import highlighted

Tidy + transform

Data science life cycle, with tidy and transform highlighted

Visualize

Data science life cycle, with visualize highlighted

Model

Data science life cycle, with model highlighted

Understand

Data science life cycle, with understand highlighted

# A tibble: 5 × 2
  date             season
  <chr>            <chr> 
1 23 January 2017  winter
2 4 March 2017     spring
3 14 June 2017     summer
4 1 September 2017 fall  
5 ...              ...   

Communicate

Data science life cycle, with communicate highlighted

Understand + communicate

Data science life cycle, with understand and communicate highlighted

Program

Data science life cycle, with program highlighted

What might you actually be doing?

  • Business intelligence
  • Cybersecurity analysis
  • Data visualization

Sure, but what are you doing?

  • Demand prediction for the manufacturing industry

  • Recommendation systems in marketing & advertising

  • Credit scoring for financial institutions

Job Growth

The U.S. Bureau of Labor Statistics continues to project job growth for data scientists.

https://www.bls.gov/ooh/math/data-scientists.htm

Data Science at Cornell

Courses

  • Statistics Core
  • Introduction to Data Science
  • Computer Science
  • Statistical and Machine Learning

Competitions

  • Midwest Undergraduate Data Analytics Competition (MUDAC)

    • A marathon
  • MinneMUDAC

  • Both

    • Meet industry professionals
    • Present your work to faculty and data scientists at companies
    • Practice on real-world applications

Research

  • Cornell Summer Research Institute
  • External Opportunities

Take a Walk

Let’s be Data Scientists

Activity Intro

What have you been listening to?

On your phone, answer the Google Survey at

bit.ly/MV_favs or scan the QR code.

Transition to RStudio